VEWS: A Wikipedia Vandal Early Warning System
We study the problem of detecting vandals on Wikipedia before any human or
known vandalism detection system reports them, flagging potential vandals so
that such users can be presented early to Wikipedia administrators. We leverage
multiple classical ML approaches, but develop 3 novel sets of features. Our
Wikipedia Vandal Behavior (WVB) approach uses a novel set of user editing
patterns as features to classify some users as vandals. Our Wikipedia
Transition Probability Matrix (WTPM) approach uses a set of features derived
from a transition probability matrix and then reduces it via a neural net
auto-encoder to classify some users as vandals. The VEWS approach merges the
previous two approaches. Without using any information (e.g. reverts) provided
by other users, these algorithms each have over 85% classification accuracy.
Moreover, when temporal recency is considered, accuracy goes to almost 90%. We
carry out detailed experiments on a new data set we have created consisting of
about 33K Wikipedia users (including both a black list and a white list of
editors) and containing 770K edits. We describe specific behaviors that
distinguish between vandals and non-vandals. We show that VEWS beats ClueBot NG
and STiki, the best known algorithms today for vandalism detection. Moreover,
VEWS detects far more vandals than ClueBot NG and on average, detects them 2.39
edits before ClueBot NG when both detect the vandal. However, we show that the
combination of VEWS and ClueBot NG can give a fully automated vandal early
warning system with even higher accuracy.
Comment: To appear in Proceedings of the 21st ACM SIGKDD Conference of
Knowledge Discovery and Data Mining (KDD 2015)
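The transition-probability-matrix features mentioned in the VEWS abstract can be illustrated with a minimal sketch. The edit categories below are hypothetical placeholders, not the paper's actual feature set, and the auto-encoder reduction step is omitted; the sketch only shows how a user's edit sequence could be turned into a row-normalised matrix and flattened into a feature vector.

```python
def transition_matrix(actions, states):
    """Row-normalised transition probabilities between consecutive actions."""
    index = {s: i for i, s in enumerate(states)}
    n = len(states)
    counts = [[0] * n for _ in range(n)]
    # Count each consecutive (previous, current) pair in the sequence.
    for prev, curr in zip(actions, actions[1:]):
        counts[index[prev]][index[curr]] += 1
    matrix = []
    for row in counts:
        total = sum(row)
        # States that were never left get an all-zero row.
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix


# Hypothetical edit categories (an assumption for illustration only).
STATES = ["new_page", "re_edit", "meta_page"]
edits = ["new_page", "re_edit", "re_edit", "meta_page", "re_edit"]

tpm = transition_matrix(edits, STATES)
# Flatten the matrix into one feature vector per user.
features = [p for row in tpm for p in row]
```

A vector like `features` could then be passed to any standard classifier; which classifier and which edit categories work best is a matter for the paper itself.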
Modeling Dyadic Human Interaction using Sequential Neural Network
Social networks, involving people and their interactions, are at the core of human society. But many current computational social methods focus more on individuals than on their interactions. Deep neural networks have been successfully applied to tasks such as natural language processing, dialog modeling, and analyzing sentiment in a conversation. In these areas, we often encounter data that originate from multiple sources. These signals can interact with each other synchronously, but detecting such synchrony may prove challenging.
In this work we focus on investigating how deep neural network architectures can help us better understand synchrony in social contexts. We investigate different coupled sequential models such as an end-to-end connected gated recurrent unit (GRU), an inherently coupled GRU, message-passing, the role of attention and the use of transformer networks for coupling.
We evaluate the effectiveness of our coupling models on multiple datasets. We first test on synthesized sequential coupled data as a sanity check and then move on to more realistic data. We test our models on three different real-world datasets collected in the context of various social interactions. In two of the datasets, we predict the rapport between two persons based on data extracted from the video of them interacting. In the third dataset, we predict friendship/familiarity between two people based on their interaction. We present the findings from the work and conclude that the coupled transformer network performs the best.
Thermal one-point functions: CFT's with fermions, large $d$ and large spin
We apply the OPE inversion formula on thermal two-point functions of fermions
to obtain thermal one-point function of fermion bi-linears appearing in the
corresponding OPE. We primarily focus on the OPE channel which contains the
stress tensor of the theory. We apply our formalism to the mean field theory of
fermions and verify that the inversion formula reproduces the spectrum as well
as their corresponding thermal one-point functions. We then examine the large $N$
critical Gross-Neveu model in $d=2k+1$ dimensions with $k$ even and at
finite temperature. We show that stress tensor evaluated from the inversion
formula agrees with that evaluated from the partition function at the critical
point. We demonstrate the expectation values of 3 different classes of higher
spin currents are all related to each other by numerical constants, spin and
the thermal mass. We evaluate the ratio of the thermal expectation values of
higher spin currents at the critical point to the Gaussian fixed point or the
Stefan-Boltzmann result, both for the large $N$ critical $O(N)$ model and the
Gross-Neveu model in odd dimensions. This ratio is always less than one and it
approaches unity on increasing the spin with the dimension held fixed. The
ratio however approaches zero when the dimension is increased with the spin
held fixed.
Comment: 46 pages, 8 figures, typos corrected
Thermal one point functions, large $d$ and interior geometry of black holes
We study thermal one point functions of massive scalars in $AdS_{d+1}$ black
holes. These are induced by coupling the scalar to either the Weyl tensor
squared or the Gauss-Bonnet term. Grinberg and Maldacena argued that the one
point functions sourced by the Weyl tensor exponentiate in the limit of large
scalar masses and they contain information of the black hole geometry behind
the horizon. We observe that the one point functions behave identically in this
limit for either of the couplings mentioned earlier. We show that in an
appropriate large $d$ limit, the one point function for the charged black hole
in $AdS_{d+1}$ can be obtained exactly. These black holes in general contain an
inner horizon. We show that the one point function exponentiates and contains
the information of both the proper time between the outer horizon to the inner
horizon as well as the proper length from the inner horizon to the singularity.
We also show that Gauss-Bonnet coupling induced one point functions in $AdS_{d+1}$
black holes with hyperbolic horizons behave as anticipated by
Grinberg-Maldacena. Finally, we study the one point functions in the background
of rotating BTZ black holes induced by the cubic coupling of scalars.
Comment: 40 pages, 4 figures, 1 table, reference added, typos corrected
Linguistic Harbingers of Betrayal: A Case Study on an Online Strategy Game
Interpersonal relations are fickle, with close friendships often dissolving
into enmity. In this work, we explore linguistic cues that presage such
transitions by studying dyadic interactions in an online strategy game where
players form alliances and break those alliances through betrayal. We
characterize friendships that are unlikely to last and examine temporal
patterns that foretell betrayal.
We reveal that subtle signs of imminent betrayal are encoded in the
conversational patterns of the dyad, even if the victim is not aware of the
relationship's fate. In particular, we find that lasting friendships exhibit a
form of balance that manifests itself through language. In contrast, sudden
changes in the balance of certain conversational attributes---such as positive
sentiment, politeness, or focus on future planning---signal impending betrayal.
Comment: To appear at ACL 2015. 10pp, 4 fig. Data and other info available at
http://vene.ro/betrayal
Characterization and Detection of Malicious Behavior on the Web
Web platforms enable unprecedented speed and ease in the transmission of knowledge, and allow users to communicate and shape opinions. However, the safety, usability and reliability of these platforms are compromised by the prevalence of online malicious behavior -- for example, 40% of users have experienced online harassment. This behavior takes the form of malicious users, such as trolls, sockpuppets and vandals, and of misinformation, such as hoaxes and fraudulent reviews. This thesis presents research spanning two aspects of malicious behavior: characterization of its behavioral properties, and development of algorithms and models for detecting it.
We characterize the behavior of malicious users and misinformation in terms of their activity, temporal frequency of actions, network connections to other entities, linguistic properties of how they write, and community feedback received from others. We find several striking characteristics of malicious behavior that are very distinct from those of benign behavior. For instance, we find that vandals and fraudulent reviewers are faster in their actions compared to benign editors and reviewers, respectively. Hoax articles are long pieces of plain text that are less coherent and created by more recent editors, compared to non-hoax articles. We find that sockpuppets vary in their deceptiveness (i.e., whether they pretend to be different users) and their supportiveness (i.e., whether they support arguments of other sockpuppets controlled by the same user).
We create a suite of feature-based and graph-based algorithms to efficiently distinguish malicious from benign behavior. We first build the first vandal early warning system, which accurately predicts vandals using very few edits. Next, based on the properties of Wikipedia articles, we develop a supervised machine learning classifier to predict whether an article is a hoax, and another that predicts whether a pair of accounts belongs to the same user, both with very high accuracy. We develop a graph-based decluttering algorithm that iteratively removes suspicious edges that malicious users use to masquerade as benign users, and that outperforms existing graph algorithms at detecting trolls. Finally, we develop an efficient graph-based algorithm to assess the fairness of all reviewers, reliability of all ratings, and goodness of all products, simultaneously, in a rating network, and incorporate penalties for suspicious behavior.
Overall, in this thesis, we develop a suite of five models and algorithms to accurately identify and predict several distinct types of malicious behavior -- namely, vandals, hoaxes, sockpuppets, trolls and fraudulent reviewers -- in multiple web platforms.
The analysis leading to the algorithms develops an interpretable understanding of malicious behavior on the web.
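The idea of scoring reviewer fairness, rating reliability, and product goodness simultaneously can be illustrated with a simplified mutual-recursion sketch. The update rules below are assumptions chosen for illustration, since the abstract does not specify them; they omit the penalties for suspicious behavior and assume rating scores scaled to [-1, 1]. The function name and data layout are hypothetical.

```python
def score_rating_network(ratings, iterations=50):
    """ratings: list of (user, product, score) tuples, score in [-1, 1]."""
    fairness = {u: 1.0 for u, _, _ in ratings}
    goodness = {p: 0.0 for _, p, _ in ratings}
    reliability = [1.0] * len(ratings)
    for _ in range(iterations):
        # Goodness: reliability-weighted mean of the scores a product received.
        for p in goodness:
            idx = [i for i, (_, q, _) in enumerate(ratings) if q == p]
            weight = sum(reliability[i] for i in idx)
            if weight > 0:
                goodness[p] = sum(reliability[i] * ratings[i][2] for i in idx) / weight
        # Reliability: average of the rater's fairness and the rating's
        # agreement with the product's current goodness.
        for i, (u, p, s) in enumerate(ratings):
            agreement = 1.0 - abs(s - goodness[p]) / 2.0
            reliability[i] = (fairness[u] + agreement) / 2.0
        # Fairness: mean reliability of the ratings a user has given.
        for u in fairness:
            idx = [i for i, (v, _, _) in enumerate(ratings) if v == u]
            fairness[u] = sum(reliability[i] for i in idx) / len(idx)
    return fairness, goodness, reliability


# Toy network: three users agree on product A, one user disagrees.
ratings = [("u1", "A", 1.0), ("u2", "A", 1.0), ("u3", "A", 1.0), ("spam", "A", -1.0)]
fairness, goodness, reliability = score_rating_network(ratings)
```

In this toy network the dissenting user ends up with a lower fairness score than the three users who agree, which is the qualitative behavior such mutually dependent scores are meant to capture.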